Investigations on speech recognition systems for low-resource dialectal Arabic–English code-switching speech
Authors
Abstract
Code-switching (CS), defined as the mixing of languages in conversations, has become a worldwide phenomenon. The prevalence of CS has recently been met with growing demand and interest in building CS automatic speech recognition (ASR) systems. In this paper, we present our work on code-switched Egyptian Arabic–English ASR. We first contribute to filling the huge gap in resources by collecting, analyzing, and publishing our spontaneous CS Egyptian Arabic–English speech corpus. We build ASR systems using DNN-based hybrid and Transformer-based end-to-end models. We present a thorough comparison between both approaches under the setting of a low-resource, orthographically unstandardized, and morphologically rich language pair. We show that while both systems achieve comparable overall results, they have complementary strengths. We show that recognition can be improved by combining the outputs of the two systems, and we propose several effective system combination approaches, where the hypotheses of both systems are merged on the sentence and word levels. Our approaches result in an overall WER relative improvement of 4.7% over a baseline performance of 32.1% WER. In the case of intra-sentential CS sentences, we achieve a WER relative improvement of 4.8%. Our best performing system achieves 30.6% WER on the ArzEn test set.
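The reported figures can be checked with a short calculation. The sketch below is illustrative only and is not the authors' evaluation code: it assumes the standard definition of word error rate (word-level edit distance divided by reference length) and a simple whitespace tokenizer, and the function names `wer` and `relative_improvement` are hypothetical.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumes whitespace tokenization, which is a simplification for
    orthographically unstandardized text.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the ref prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, in percent, of new_wer over baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# Figures quoted in the abstract: 32.1% baseline WER, 30.6% best system.
print(round(relative_improvement(32.1, 30.6), 1))
```

With the abstract's numbers, the absolute reduction is 1.5 WER points, which corresponds to the stated relative improvement of about 4.7%.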
Similar resources
Discriminative pronunciation modeling for dialectal speech recognition
Speech recognizers are typically trained with data from a standard dialect and do not generalize to non-standard dialects. Mismatch mainly occurs in the acoustic realization of words, which is represented by acoustic models and pronunciation lexicon. Standard techniques for addressing this mismatch are generative in nature and include acoustic model adaptation and expansion of lexicon with pron...
Code-Switching speech recognition for closely related languages
This work presents an approach to recognition of multispeaker conversational speech with code-switching between Ukrainian and Russian languages. Both inter-sentential and intra-sentential code-switching is handled. The approach takes into account peculiarities of phonetic systems of the closely related Russian and Ukrainian languages. A crosslingual LVCSR system is developed. The acoustic model...
Dialectal Chinese Speech Recognition: Final Report
†Richard Sproat, University of Illinois (Thomas) Fang Zheng, Tsinghua University Liang Gu, IBM Jing Li, Tsinghua University Yanli Zheng, University of Illinois Yi Su, Johns Hopkins University Haolang Zhou, Johns Hopkins University Philip Bramsen, MIT David Kirsch, Lehigh University Izhak Shafran, Johns Hopkins University Stavros Tsakalidis, Johns Hopkins University Rebecca Starr, Stanford Unive...
Scandinavia INVESTIGATIONS ON CONVERSATIONAL SPEECH RECOGNITION
Automatic speech recognition of real-life conversational speech is a precondition for building natural human-centered man-machine interfaces. Being able to extract speech utterances from real-life broadcast news audio streams and transcribing them with an overall word accuracy of 83% we are still faced with the problem of transcribing true conversational speech in real-life (i.e. bad) backgroun...
Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages
In recent years there has been significant interest in Automatic Speech Recognition (ASR) and Key Word Spotting (KWS) systems for low resource languages. One of the driving forces for this research direction is the IARPA Babel project. This paper examines the performance gains that can be obtained by combining two forms of deep neural network ASR systems, Tandem and Hybrid, for both ASR and KWS...
Journal
Journal title: Computer Speech & Language
Year: 2022
ISSN: 1095-8363, 0885-2308
DOI: https://doi.org/10.1016/j.csl.2021.101278